An extension 2DPCA based visual feature extraction method for audio-visual speech recognition
Authors
Abstract
Two dimensional principal component analysis (2DPCA) has been proposed for face recognition as an alternative to the traditional PCA transform [1]. In this paper, we extend this approach to visual feature extraction for audio-visual speech recognition (AVSR). First, a two-stage 2DPCA transform is conducted to extract the visual features. Then, visemic linear discriminant analysis (LDA) is applied as a post-extraction processing step. We compare the presented method with traditional PCA and 2DPCA. Experimental results show that the extended 2DPCA reduces the feature dimension of 2DPCA and represents the test mouth images better than PCA does. Moreover, 2DPCA+LDA requires less computation and performs better than PCA+LDA in visual-only speech recognition. Finally, further experiments demonstrate that our AVSR system using the extended 2DPCA method is significantly more robust in noisy environments than audio-only speech recognition.
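To make the described pipeline concrete, the following is a minimal sketch of a two-stage (row- and column-direction) 2DPCA projection of mouth-region images followed by an LDA post-processing step, assuming grayscale mouth ROIs stacked in a NumPy array. The function names, image sizes, numbers of retained axes, and the use of scikit-learn's LinearDiscriminantAnalysis with hypothetical viseme labels are illustrative assumptions, not the paper's implementation.

# Sketch: two-stage 2DPCA feature extraction + LDA post-processing.
# All names, dimensions, and the sklearn LDA stand-in for "visemic LDA"
# are assumptions for illustration.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def fit_two_stage_2dpca(images, n_row_axes=8, n_col_axes=8):
    """Learn row- and column-direction projection axes from training
    images of shape (num_images, rows, cols)."""
    mean_img = images.mean(axis=0)
    centered = images - mean_img
    # Column-direction image covariance: mean of (A - Abar)^T (A - Abar), shape (cols, cols).
    g_col = np.einsum('krc,krd->cd', centered, centered) / len(images)
    # Row-direction image covariance: mean of (A - Abar)(A - Abar)^T, shape (rows, rows).
    g_row = np.einsum('krc,ksc->rs', centered, centered) / len(images)
    # Keep the leading eigenvectors of each covariance as projection axes.
    _, v_col = np.linalg.eigh(g_col)
    _, v_row = np.linalg.eigh(g_row)
    x_axes = v_col[:, ::-1][:, :n_col_axes]   # (cols, d)
    z_axes = v_row[:, ::-1][:, :n_row_axes]   # (rows, q)
    return mean_img, z_axes, x_axes

def project_two_stage_2dpca(images, mean_img, z_axes, x_axes):
    """Project each image A to C = Z^T (A - Abar) X and flatten to a vector."""
    centered = images - mean_img
    feats = np.einsum('rq,krc,cd->kqd', z_axes, centered, x_axes)
    return feats.reshape(len(images), -1)

if __name__ == '__main__':
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for mouth ROI frames and per-frame viseme labels.
    train_imgs = rng.random((200, 32, 48))
    viseme_labels = rng.integers(0, 10, size=200)

    mean_img, z_axes, x_axes = fit_two_stage_2dpca(train_imgs)
    train_feats = project_two_stage_2dpca(train_imgs, mean_img, z_axes, x_axes)

    # Post-extraction LDA, fitted on the (hypothetical) viseme class labels.
    lda = LinearDiscriminantAnalysis()
    visual_features = lda.fit_transform(train_feats, viseme_labels)
    print(visual_features.shape)   # (200, num_viseme_classes - 1)

The two covariance matrices are only (cols x cols) and (rows x rows), which is why this kind of bidirectional projection can shrink the feature dimension relative to plain 2DPCA before the LDA stage.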
Similar resources
Audio-Visual Speech Recognition for a Person with Severe Hearing Loss Using Deep Canonical Correlation Analysis
Recently, we proposed an audio-visual speech recognition system based on a neural network for a person with an articulation disorder resulting from severe hearing loss. In the case of a person with this type of articulation disorder, the speech style is quite different from that of people without hearing loss, making a speaker-independent acoustic model for unimpaired persons more or less usele...
Audio-Visual Speech Recognition Using Bimodal-Trained Bottleneck Features for a Person with Severe Hearing Loss
In this paper, we propose an audio-visual speech recognition system for a person with an articulation disorder resulting from severe hearing loss. In the case of a person with this type of articulation disorder, the speech style is so different from that of people without hearing loss that a speaker-independent acoustic model for unimpaired persons is hardly useful for recognizing it. The a...
Audio-Visual Speech Recognition Using Convolutive Bottleneck Networks for a Person with Severe Hearing Loss
In this paper, we propose an audio-visual speech recognition system for a person with an articulation disorder resulting from severe hearing loss. In the case of a person with this type of articulation disorder, the speech style is so different from that of people without hearing loss that a speaker-independent model for unimpaired persons is hardly useful for recognizing it....
An Audio-visual Speech Recognition System for Testing New Audio-visual Databases
For the past several decades, visual speech signal processing has been an attractive research topic for overcoming certain audio-only recognition problems. In recent years, many automatic speech-reading systems that combine audio and visual speech features have been proposed. For all such systems, the objective of these audio-visual speech recognizers is to improve recognition accuracy, parti...
Machine learning based Visual Evoked Potential (VEP) Signals Recognition
Introduction: Visual evoked potentials contain certain diagnostic information which has proved to be important in assessing the functional integrity of the visual system. Due to the substantial decrease of amplitude under extramacular stimulation in commonly used pattern VEPs, differentiating normal and abnormal signals can prove to be quite an obstacle. Due to developments in the use of machine l...
Publication date: 2007